Examining Current Commodity CMPs for Fault Isolation

Authors

  • Nidhi Aggarwal
  • Norman P. Jouppi
  • James E. Smith
Abstract

Resource sharing in modern chip multiprocessors (multicores) provides many cost and ... chip level. In turn, this enables cost benefits from reduced component count. Additionally, enhanced resource sharing leads to better performance. On-chip components can now be easily shared to improve resource utilization, such as core sharing via hyperthreading, shared caches, and I/O interfaces. However, the same features of multicore processors that offer benefits can also present drawbacks. In particular, the increased levels of consolidation and integration raise important isolation concerns for performance, security, and fault tolerance. Fault tolerance is an area of major concern, particularly because recent studies have shown dramatic increases in the number of hardware errors when scaling technology to smaller feature sizes [4]. Developers have encountered two main kinds of errors. First, defects in the silicon cause permanent or intermittent hardware faults, resulting in wear-out over time and leading to hard errors. Second, electrical noise or external radiation can cause transient faults when, for example, alpha radiation from impurities or gamma radiation from outside changes random bits, leading to soft errors. With CMPs, the fault-tolerance problem is compounded because a fault in any single component can lead to the failure of the entire chip. The failure-in-time (FIT) rates of the cores, caches, memory, and I/O components combine to give a high FIT for the CMP. Future CMP ...
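
The point about component FIT rates combining can be illustrated with a minimal sketch. It assumes the usual series-system model (independent components with constant failure rates, where any single failure brings down the chip), under which the per-component FIT rates simply add. The numbers below are hypothetical and are not taken from the article; FIT is counted as failures per 10^9 device-hours.

```python
# Hypothetical per-component FIT rates (failures per 10^9 device-hours).
# Illustrative values only; they do not come from the article.
component_fit = {
    "cores": 400.0,
    "caches": 250.0,
    "memory": 150.0,
    "io": 200.0,
}


def chip_fit(fits):
    """Series-system assumption: any component failure fails the chip,
    so the chip-level FIT is the sum of the component FIT rates."""
    return sum(fits.values())


def mttf_hours(fit):
    """Mean time to failure in hours for a constant failure rate in FIT."""
    return 1e9 / fit


total = chip_fit(component_fit)
print(f"chip FIT = {total}, MTTF = {mttf_hours(total):.0f} hours")
```

Under this model, isolating a fault to one component (rather than losing the whole chip) is what keeps a single component's FIT from counting against the entire CMP.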

Similar resources

COVERT: Configurable Virtual Redundancy with Transparent Availability on Commodity Software

Overview: Scaling integrated circuit technology into the deep submicron regime is expected to increase both soft and hard error rates significantly [1]. Therefore, providing high availability in the presence of relatively unreliable components is likely to become an increasingly important requirement for a diverse set of systems, including general-purpose commodity systems. Traditionally, high a...

Poster: (SF)2I - Structure Field Software Fault Isolation

Commodity operating systems are self-extending, loading code at runtime to add new features. While useful, such self-extensibility allows attackers to inject kernel-level malware into the operating system kernel. Such malware threatens security system-wide and is not yet completely mitigated. This poster demonstrates our approach to provide safe extensibility of commodity operating system kernels.

Online Fault Detection and Isolation Method Based on Belief Rule Base for Industrial Gas Turbines

Real-time and accurate fault detection has attracted increasing attention with the growing demand for higher operational efficiency and safety of industrial gas turbines as complex engineering systems. Current methods based on condition monitoring data have drawbacks in using both expert knowledge and quantitative information for detecting faults. For this reason, this paper proposes...

Performance-asymmetry-aware scheduling for Chip Multiprocessors with static core coupling

Thread-level redundancy is an efficient approach for transient fault detection and recovery in Chip Multiprocessors (CMPs), in which two adjacent cores are statically coupled to form a functional Dual Modular Redundancy (DMR). Manufacturing process variations cause core-to-core (C2C) performance asymmetry across the chip, which can be further divided into the asymmetry among core-pairs and the ...

Seismic Rehabilitation of Liquid Storage Tanks using Friction Pendulum Base Isolation subjected to the Near-Fault Ground Motions

Cylindrical liquid storage tanks are considered vital structures in industrial complexes, and their nonlinear dynamic behavior is of crucial importance. Some of these structures around the world have demonstrated poor seismic behavior over the last decades. There are several methods and techniques for rehabilitation and reducing damages in these structures which among them passive control devices, ...

Publication date: 2007